AD-A097 714
UNCLASSIFIED

ILLINOIS UNIV AT URBANA COMPUTER-BASED EDUCATION RESE--ETC F/G 12/1
SOME COMPARISONS OF FOUR ORDER-ANALYTIC METHODS AND FACTOR ANAL--ETC(U)
FEB 81 S L WISE N00014-79-C-0752
RR-81-2

Computer-based  Education 
Research  Laboratory 


University  of  Illinois 


Urbana  Illinois 


SOME  COMPARISONS  OF  FOUR 
ORDER-ANALYTIC  METHODS  AND  FACTOR 
ANALYSIS  FOR  ASSESSING  DIMENSIONALITY 

STEVEN L. WISE


Approved  for  public  release;  distribution  unlimited. 
Reproduction  in  whole  or  in  part  permitted  for  any 
purpose  of  the  United  States  Government. 


This  research  was  sponsored  by  the  Personnel  and  Training 
Research  Program,  Psychological  Sciences  Division,  Office 
of  Naval  Research,  under  Contract  No.  N00014-79-C-0752. 
Contract Authority Identification Number NR 150-415.


COMPUTERIZED  ADAPTIVE  TESTING  AND  MEASUREMENT 


RESEARCH REPORT 81-2
FEBRUARY  1981 


Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

1. REPORT NUMBER: Research Report 81-2

4. TITLE (and Subtitle): Some Comparisons of Four Order-Analytic Methods
   and Factor Analysis for Assessing Dimensionality

7. AUTHOR(s): Steven L. Wise

8. CONTRACT OR GRANT NUMBER(s): N00014-79-C-0752

9. PERFORMING ORGANIZATION NAME AND ADDRESS:
   Computer-based Education Research Laboratory
   University of Illinois
   Urbana, IL 61801

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
    RR042-04; NR154-445

11. CONTROLLING OFFICE NAME AND ADDRESS:
    Personnel and Training Research Programs
    Office of Naval Research (Code 458)
    Arlington, VA 22217

12. REPORT DATE: February 1981

15. SECURITY CLASS. (of this report): Unclassified

16. DISTRIBUTION STATEMENT (of this Report):
    Approved for public release; distribution unlimited

19. KEY WORDS: dimensionality, order analysis, latent-trait theory,
    ordering theory, consistency, factor analysis

20. ABSTRACT

Current  latent-trait  methods  require  that  the  latent  space  underlying 
a  group's  test  performance  be  unidimensional.  However,  many  tests  yield 
multidimensional  data,  implying  that  more  than  one  latent  trait  would  be 
necessary  to  account  for  test  performance.  A  possible  solution  to  this 
problem  of  multidimensionality  would  be  to  isolate  unidimensional  subsets 
of items from the total set of test items and use item response theory
with each subset. While factor analysis is the most commonly proposed
procedure for determining dimensionality, a recently developed procedure called
order  analysis  may  also  prove  to  be  useful  for  isolating  unidimensional 
item  sets. 

The first study in this report dealt with a comparison of three order
analysis procedures: Krus & Bart's (1974) procedure and Reynolds' (1976)
procedures using two of Cliff's (1977) consistency indices, c_t1 and c_t3,
respectively. The comparisons were based on seven simulated datasets with
known factorial dimensionality, and two multidimensional sets of mathematics
data. The c_t3 procedure reproduced the factor structure for all of the
simulated datasets, while the other two procedures performed very poorly.
However, for the mathematics data, all three procedures failed to reproduce
the factors.

The  second  study  in  this  report  presents  preliminary  results  using  a  new 
order-analysis  procedure  which  solves  some  of  the  difficulties  with  the 
other procedures in reproducing factorial dimensionality. This new
procedure (dubbed ORDO) reproduced the factors for the mathematics data as well
as  for  the  simulated  data.  It  is  hoped  that  ORDO  will  represent  a  useful 
alternative  to  factor  analysis  for  determining  unidimensional  item  sets 
appropriate  for  latent-trait  methods. 

SOME COMPARISONS OF FOUR ORDER-ANALYTIC
METHODS  AND  FACTOR  ANALYSIS  FOR  ASSESSING  DIMENSIONALITY 

Steven  L.  Wise 

ABSTRACT 

Current  latent-trait  methods  require  that  the  latent  space 
underlying  a  group's  test  performance  be  unidimensional.  However, 
many  tests  yield  multidimensional  data,  implying  that  more  than  one 
latent  trait  would  be  necessary  to  account  for  test  performance.  A 
possible  solution  to  this  problem  of  multidimensionality  would  be  to 
isolate  unidimensional  subsets  of  items  from  the  total  set  of  test 
items  and  use  item  response  theory  with  each  subset.  While  factor 
analysis  is  the  most  commonly  proposed  procedure  for  determining 
dimensionality, a recently developed procedure called order analysis
may also prove to be useful for isolating unidimensional item sets.

The first study in this report dealt with a comparison of three
order analysis procedures: Krus & Bart's (1974) procedure and
Reynolds' (1976) procedures using two of Cliff's (1977) consistency
indices, c_t1 and c_t3, respectively. The comparisons were based on
seven simulated datasets with known factorial dimensionality, and
two multidimensional sets of mathematics data. The c_t3 procedure
reproduced the factor structure for all of the simulated datasets,
while the other two procedures performed very poorly. However, for the
mathematics data, all three procedures failed to reproduce the factors.

The second study in this report presents preliminary results
using a new order-analysis procedure which solves some of the
difficulties with the other procedures in reproducing factorial
dimensionality. This new procedure (dubbed ORDO) reproduced the factors for
the mathematics data as well as for the simulated data. It is hoped
that ORDO will represent a useful alternative to factor analysis for
determining unidimensional item sets appropriate for latent-trait
methods.


Introduction 


A  major  issue  in  item  response  theory  concerns  determining  the 
number  of  latent  dimensions  (traits)  needed  to  adequately  account  for 
the  test  performance  of  a  group  of  individuals.  If  all  of  the  relevant 
dimensions  are  not  accounted  for,  then  the  requirement  of  local 
independence  of  items  will  not  hold  and  the  item  response  model  will 
be  intractable.  This  problem  is  compounded  by  current  practical 
limitations of item response theory. While multidimensional latent
trait models have been proposed, estimation problems arising
from these models have rendered them all but useless in the field.
Hence, the current state of affairs regarding item response theory
prevents one from considering more than one latent trait at a time.
This means that the latent space under consideration has to be
unidimensional in order to be practicable. However, many tests yield
multidimensional  data,  implying  that  more  than  one  latent  trait  would 
be  necessary  to  account  for  test  performance. 

One  possible  solution  to  this  problem  of  multidimensionality 
would  be  to  extract  unidimensional  subsets  of  items  from  the  larger, 
multidimensional  set  of  items,  and  use  item  response  theory  to  generate 
separate ability estimates from each subset. The most commonly
prescribed method of determining the dimensionality of a set of items is
factor analysis. However, Krus (1975) points out that factor analysis
methods contain a considerable amount of indeterminacy due to a
relative  lack  of  consensus  regarding  such  issues  as  (1)  appropriate 
factor  extraction  method,  (2)  the  problem  of  communality  estimation, 
and  (3)  the  number  of  factors  to  extract.  Krus  has  suggested  use  of 
order  analysis  as  an  alternative  to  factor  analysis  in  determining 
the  dimensionality  of  a  set  of  data. 

Order  analysis  (Krus,  Bart,  &  Airasian,  1975;  Krus,  1975)  was 
developed  to  investigate  logical  relations  between  the  elements  of 
a  binary  data  matrix.  The  method  presumes  that  elements  measuring 
a  single  dimension  show  characteristics  of  a  strong  simple  order, 
i.e., that the relations between the elements are transitive,
asymmetric, and connected (see Coombs, Dawes, and Tversky, 1970).

The relation of interest in order analysis is dominance. If
a  person  fails  item  i  and  passes  item  j,  then  item  i  is  said  to 
dominate  item  j  for  that  person.  This  follows  from  transitivity; 
since  the  person  is  dominated  by  item  i  (fails  item  i)  and  the  person 
dominates  item  j  (passes  item  j)  then  it  is  implied  that  item  i 
dominates  item  j.  This  will  be  called  an  ij  dominance. 
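
The dominance relation can be tallied directly from a binary persons-by-items matrix. The following is a minimal sketch, assuming rows are persons and columns are items scored 1 (pass) or 0 (fail); the function name `dominance_counts` and the small response matrix are hypothetical illustrations and do not come from the report.

```python
def dominance_counts(data):
    """Return counts[i][j] = number of persons who fail item i and pass item j.

    Each such person contributes one "ij dominance" in the sense defined above.
    """
    n_items = len(data[0])
    counts = [[0] * n_items for _ in range(n_items)]
    for person in data:
        for i in range(n_items):
            for j in range(n_items):
                if person[i] == 0 and person[j] == 1:
                    counts[i][j] += 1
    return counts


# Hypothetical responses: rows are persons, columns are items (1 = pass, 0 = fail).
responses = [
    [1, 1, 1],
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
dom = dominance_counts(responses)
```

In this made-up matrix the items form a perfect Guttman pattern, so every dominance runs in one direction and the counter-dominance counts are all zero.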

If  there  is  a  one-dimensional  latent  attribute  underlying  the 
behavior  reflected  by  the  data,  then  the  item  relations  will  be 
consistent  across  persons  (Coombs,  et  al.,  1970).  Hence,  for  any 
items  i  and  j,  all  persons  should  show  either  an  ij  dominance,  or  they 
should  all  show  a  ji  dominance.  Lack  of  consistency  across  persons 
is  in  violation  of  the  order-analytic  model.  However,  since  there 
are  usually  errors  of  measurement  present  in  the  data  matrix,  some 
amount of inconsistency is tolerated. Krus et al. (1975) proposed
the use of McNemar's (1947) z statistic for correlated proportions to
evaluate the preponderance of ij dominances over ji dominances. If
the value of z is sufficiently large, then item i is concluded to
dominate  item  j  for  the  entire  group.  It  is  also  assumed  that  the 
ji  dominances  are  due  to  error.  In  the  case  where  there  is  a  single 
order  present  in  the  set  of  items,  each  item  will  dominate  all  items 
"below"  it  in  the  order,  and  transitivity,  asymmetry  and  connectedness 
will  all  be  realized.  This  set  of  items,  also  called  a  chain,  will 
essentially  form  a  Guttman  scale. 
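
As an illustration of this decision rule, the sketch below uses the usual form of McNemar's statistic for correlated proportions, z = (n_ij - n_ji) / sqrt(n_ij + n_ji), where n_ij and n_ji are the counts of ij and ji dominances. The function names are hypothetical, and the default criterion of 1.64 is the value reported for Study I later in this report; treating a z between -1.64 and 1.64 as "no relation" is an assumption about how the rule is applied.

```python
import math


def mcnemar_z(n_ij, n_ji):
    """McNemar z for the preponderance of ij dominances over ji dominances."""
    total = n_ij + n_ji
    if total == 0:
        return 0.0  # no dominances in either direction
    return (n_ij - n_ji) / math.sqrt(total)


def group_relation(n_ij, n_ji, criterion=1.64):
    """Classify an item pair from its dominance counts (assumed decision rule)."""
    z = mcnemar_z(n_ij, n_ji)
    if z >= criterion:
        return "i dominates j"
    if z <= -criterion:
        return "j dominates i"
    return "no relation"
```

For example, `group_relation(10, 9)` gives "no relation", which under Krus's interpretation would place the two items in different orders.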

There are times, however, when the z value between two items i
and  j  does  not  indicate  a  clear  ij  dominance  or  ji  dominance.  This 
violates  the  connectedness  property  that  there  must  be  a  relation 
between  each  pair  of  items  in  the  order.  According  to  Krus  (1975), 
this  indicates  that  items  i  and  j  are  not  members  of  the  same  order, 
and  that  the  data  are  multidimensional.  Based  on  this,  a  deterministic 
order-analytic  model  for  determining  the  dimensionality  of  an  item 
set  was  developed  (Krus  &  Bart,  1974),  and  later  a  probabilistic 
model  (Krus,  1977). 

Cliff (1977) developed a number of indices to assess the consistency
of simple orders. The first, c_t1, reflects the proportion of the total
number of dominances in a dataset which are consistent with a particular
ordering. Another important index, c_t3, is similar to c_t1 except that
it  contains  an  adjustment  for  the  number  of  dominances  expected  by 
chance  for  independent  items.  It  is  equivalent  to  Loevinger's  (1947) 
index  of  homogeneity. 

Reynolds (1976) rejected the approach of using McNemar's z
test  to  evaluate  the  relation  between  items  and  then  using  the  relations 
to  generate  item  chains.  He  pointed  out  that  Krus  and  Bart's  (1974) 
deterministic  method  does  not  necessarily  yield  a  unique  set  of  item 
chains  and  that  other,  more  "optimal"  chains  may  also  be  extracted. 
Reynolds  also  noted  that  the  Krus  and  Bart  procedure  lacked  any 
goodness-of-fit statistics to evaluate how well an ordering is consistent
across persons. Reynolds outlined an algorithm, using one of
Cliff's  (1977)  consistency  indices,  to  extract  item  chains.  Each 
item  in  the  set  is  used  as  a  starting  point  in  a  chain.  The  most 
consistent  items  are  then  successively  added  to  the  chain  until  the 
overall  chain  consistency  index  value  drops  below  some  minimally 
acceptable  level.  Redundant  chains  are  then  deleted,  and  the  remaining 
chains  are  interpreted  as  representing  the  dimensions  of  the  dataset. 
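
One way to read the algorithm just described is as a greedy search. The sketch below is an illustration under stated assumptions, not Reynolds' published implementation: the chain-level index is taken to be the net proportion of consistent dominances pooled over all item pairs in the chain, "redundant" is taken to mean a chain contained in another chain, and the dominance counts in `dom` are hypothetical.

```python
def chain_consistency(chain, dom):
    """Net proportion of consistent dominances over all pairs in the chain (assumed index)."""
    net = total = 0
    for a in range(len(chain)):
        for b in range(a + 1, len(chain)):
            i, j = chain[a], chain[b]
            net += abs(dom[i][j] - dom[j][i])
            total += dom[i][j] + dom[j][i]
    return net / total if total else 0.0


def extract_chains(dom, cutoff=0.90):
    """Grow one chain from every item, then drop chains contained in another chain."""
    n_items = len(dom)
    chains = set()
    for start in range(n_items):
        chain = [start]
        remaining = set(range(n_items)) - {start}
        while remaining:
            # Candidate whose addition keeps the chain most consistent.
            best = max(sorted(remaining),
                       key=lambda k: chain_consistency(chain + [k], dom))
            if chain_consistency(chain + [best], dom) < cutoff:
                break
            chain.append(best)
            remaining.remove(best)
        if len(chain) > 1:
            chains.add(frozenset(chain))
    kept = [c for c in chains if not any(c < other for other in chains)]
    return sorted(sorted(c) for c in kept)


# Hypothetical dominance counts: items 0-2 form a clean scale, item 3 does not fit.
dom = [
    [0, 10, 20, 15],
    [0, 0, 10, 10],
    [0, 0, 0, 5],
    [5, 10, 15, 0],
]
chains = extract_chains(dom)
```

With these counts every starting item among 0, 1, and 2 grows into the same three-item chain, and item 3 never reaches the cutoff with any partner, so a single chain is returned.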

Earlier  studies  have  failed  to  show  a  consistent  relationship 
between  the  results  of  order  analysis  and  factor  analysis.  Krus  and 
Weiss (1976) found congruence between the two methods for Thurstone's
(1947, p. 140-143) "box data"; however, when they analyzed random
data using Armstrong and Soelberg's (1968) method, they found differing
results using order analysis and factor analysis. Bart (1978)
reanalyzed the data reported in Bock and Lieberman (1970) and concluded
that  the  factor  structure  of  a  set  of  data  did  not  appear  to  relate  in 
a  clear  way  to  the  order  structure. 

Study I

The  purpose  of  the  first  study  was  to  compare  different  order- 
analysis  procedures  on  a  number  of  datasets  with  varying  factorial 
dimensionality. Seven simulated dichotomous datasets were generated.
These datasets differed both in terms of number of common factors and
in  terms  of  variance  of  the  item  difficulty  levels.  Also,  two  datasets 
composed  of  signed-numbers  mathematics  items  (described  more  fully 
in  Birenbaum  and  Tatsuoka  (1980))  were  used  in  comparing  the  order- 
analysis  procedures.  These  analyses  could  aid  in  the  understanding 
of  the  differences  among  the  procedures,  as  well  as  providing  insight 
regarding  which  procedure  would  be  most  useful  in  extracting  sets  of 
items which satisfy the unidimensionality assumption of current
latent-trait models (Lord & Novick, 1968).


Method 


Simulated  Datasets 

Seven  simulated  dichotomous  datasets  were  generated  using  the 
FORMAL  and  TUCKLIB  packages  of  FORTRAN  subroutines  at  the  University 
of  Illinois.  Each  dataset,  which  consisted  of  10  items  and  500 
persons,  was  computed  as  follows.  A  factor  pattern  matrix  and  a 
vector of uniquenesses were specified by the user. From this information,
a population variance-covariance matrix was generated using
a  modified  Tucker,  Koopman,  and  Linn  (1969)  procedure  which 
simulated  the  effects  of  random  error  on  the  variance-covariance 
matrix  by  allowing  for  the  influence  of  a  number  of  minor  random 
factors.  This  population  variance-covariance  matrix  was  then  used  in 
conjunction  with  a  vector  of  user-specified  population  item  means  to 
generate  dichotomous  item  scores  from  a  multivariate  normal  population. 
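
The generation steps above can be approximated in a few lines. This is a simplified sketch, not the FORMAL/TUCKLIB procedure: it assumes orthogonal common factors, omits the minor random factors of the Tucker, Koopman, and Linn method, and assumes that dichotomous scores arise by cutting a normally distributed latent score at the point that yields the target item mean. The function name, loadings, uniquenesses, and item means are all hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_dichotomous(loadings, uniquenesses, item_means, n_persons, seed=0):
    """Simulate 0/1 item scores from an orthogonal common-factor model."""
    rng = random.Random(seed)
    n_items = len(loadings)
    n_factors = len(loadings[0])
    # Latent standard deviation implied by the factor pattern and uniqueness.
    sds = [math.sqrt(sum(w * w for w in loadings[i]) + uniquenesses[i])
           for i in range(n_items)]
    # Cut point chosen so that P(latent > cut) equals the target item mean.
    cuts = [NormalDist(0.0, sds[i]).inv_cdf(1.0 - item_means[i])
            for i in range(n_items)]
    data = []
    for _ in range(n_persons):
        factors = [rng.gauss(0.0, 1.0) for _ in range(n_factors)]
        row = []
        for i in range(n_items):
            common = sum(loadings[i][k] * factors[k] for k in range(n_factors))
            unique = math.sqrt(uniquenesses[i]) * rng.gauss(0.0, 1.0)
            row.append(1 if common + unique > cuts[i] else 0)
        data.append(row)
    return data


# Hypothetical two-factor layout: items 1-4 on one factor, items 5-10 on the other.
loadings = [[0.7, 0.0]] * 4 + [[0.0, 0.7]] * 6
uniquenesses = [0.51] * 10
item_means = [0.10 + 0.08 * i for i in range(10)]
data = simulate_dichotomous(loadings, uniquenesses, item_means, n_persons=500)
```

Spreading the hypothetical item means from .10 to .82 mimics a dataset with distinct difficulty levels; setting them all to the same value would mimic a dataset in which sample difficulties differ only by random variation.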

The  seven  simulated  datasets  are  described  in  Table  1.  It 
was  decided  that  the  distributions  of  item  difficulty  levels  might  have 
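
Table 1

Descriptions of the Simulated Datasets

Dataset Label   Description (10 items, N = 500)
-------------   -------------------------------
H1              One common factor; highly spaced item means
M1              One common factor; moderately spaced item means
L1              One common factor; equal population item means
H2              Two common factors; highly spaced item means
M2              Two common factors; moderately spaced item means
L2              Two common factors; equal population item means
M10             No common factors; moderately spaced item means
                (essentially a 10-dimensional dataset)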


Table 2

Examples of the 16 Signed-Number Mathematics Skills

Item (Skill)   Example              Operation
------------   ------------------   -----------
 1             1 - (-10) = 11       Subtraction
 2             9 - (-7) = 16        Subtraction
 3             -7 - 9 = -16         Subtraction
 4             -12 - 3 = -15        Subtraction
 5             -3 - 12 = -15        Subtraction
 6             -6 - (-8) = 2        Subtraction
 7             -16 - (-7) = -9      Subtraction
 8             8 - 6 = 2            Subtraction
 9             2 - 11 = -9          Subtraction
10             6 + 4 = 10           Addition
11             -14 + (-5) = -19     Addition
12             -5 + (-7) = -12      Addition
13             -3 + 12 = 9          Addition
14             -6 + 4 = -2          Addition
15             12 + (-3) = 9        Addition
16             3 + (-5) = -2        Addition

a differential effect on the order-analysis procedures. Hence, three
types  of  item  mean  distributions  were  used:  Highly  spaced  means  where 
each  item  difficulty  level  is  very  distinct  from  that  of  the  other 
items,  moderately  spaced  means  where  some  item  difficulty  levels  are 
similar,  and  means  which  had  the  same  population  difficulty  level 
but  whose  differences  in  sample  difficulty  levels  were  due  only  to 
random  variation.  Also,  for  the  two-factor  datasets  (H2,  M2,  L2) 
items  1-4  always  loaded  on  one  factor,  and  items  5  -  10  loaded  on 
the  other  factor. 

Dataset  M10  was  unique  in  that  it  was  generated  so  that  there 
were  no  common  factors  among  the  items.  It  consisted  of  10  unrelated 
items  with  moderately  spaced  means.  This  dataset  was  useful  in 
comparing  order-analysis  procedures  in  their  abilities  to  indicate  a 
lack  of  order  structure. 

Mathematics  Data 

The  mathematics  dataset  consisted  of  16  dichotomous  mastery 
scores  derived  from  a  64-item  signed-numbers  test  administered  to 
125  eighth  grade  students  during  November,  1979.  There  were  16 
skills,  each  measured  by  four  parallel  items.  Examples  of  these 
skills are shown in Table 2. If a student got at least three of the
four  items  correct,  he  or  she  was  deemed  a  master  of  that  skill  and 
given  a  mastery  score  of  one.  Otherwise,  a  score  of  zero  was  given 
(non-mastery) . 
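
The mastery rule can be written out as a small function. The three-of-four threshold comes from the text above; the assumption that the four parallel items for each skill sit in consecutive positions of a student's 64-item score vector is hypothetical, as are the function name and example data.

```python
def mastery_scores(item_scores, items_per_skill=4, min_correct=3):
    """Collapse 0/1 item scores into one 0/1 mastery score per skill.

    Assumes the parallel items for each skill occupy consecutive positions.
    """
    scores = []
    for start in range(0, len(item_scores), items_per_skill):
        block = item_scores[start:start + items_per_skill]
        scores.append(1 if sum(block) >= min_correct else 0)
    return scores


# Hypothetical student: 3 of 4 correct on skill 1, 2 of 4 on skill 2, none elsewhere.
student = [1, 1, 1, 0] + [1, 1, 0, 0] + [0] * 56
mastery = mastery_scores(student)
```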

Two  forms  of  the  mathematics  dataset  were  analyzed.  Birenbaum 
and  Tatsuoka  (1980)  describe  a  procedure  for  detecting  inappropriate 
strategies  used  by  students  in  solving  signed-number  problems.  Often, 
students  can  get  "correct"  answers  to  some  of  these  problems  using 
incorrect  strategies.  Once  an  incorrect  strategy  was  detected  for 
a  given  student,  it  was  possible  to  determine  the  items  for  which  the 
student  would  have  given  the  correct  answer  using  the  inappropriate 
strategy.  An  "adjusted"  dataset  was  then  constructed  from  the  original 
64-item mathematics dataset such that items deemed to have been answered
correctly by an inappropriate strategy were rescored as incorrect.
Dichotomous  mastery  scores  were  then  recomputed  for  the  adjusted 
dataset.  Order  analyses  were  subsequently  performed  on  both  the 
unadjusted  (UMATH)  and  adjusted  (AMATH)  16-item  mastery  datasets. 

Order-Analysis Procedures

Three  order-analysis  procedures  were  used:  the  deterministic 
order-analysis  method  of  Krus  and  Bart  (1974),  Reynolds'  (1976) 
algorithm using c_t1 as an extraction index, and Reynolds' procedure
using c_t3. To determine the presence of a relation in Krus and
Bart's procedure, a criterion McNemar's z value of 1.64 was used.
Krus' (1977) probabilistic order-analysis procedure was not used for
two  reasons.  First,  it  was  decided  that  the  results  obtained  from 
the  deterministic  and  probabilistic  models  would  be  similar  enough 
that  both  procedures  would  not  be  necessary  in  this  study.  Second, 
since  Reynolds'  (1976)  method  is  deterministic,  the  deterministic 
order-analysis method was chosen in order to permit the most
straightforward comparisons among the results of the different methods.


Results 

Simulated  Data 

In  order  to  verify  the  factor  structures  of  the  simulated 
datasets,  simple  common  factor  analyses  of  the  matrices  of  phi 
coefficients  were  performed.  For  datasets  where  more  than  one  common 
factor  was  extracted,  factors  were  rotated  using  the  Varimax  criterion. 
The  results  of  these  factor  analyses,  along  with  the  item  means  and 
standard  deviations,  are  shown  in  Appendices  1  through  7.  All  seven 
datasets  showed  clear  factorial  dimensionality  in  agreement  with  the 
factor  pattern  matrices  from  which  the  datasets  were  generated.  For 
dataset  M10,  a  scree  test  of  the  eigenvalues  led  to  the  conclusion 
that  no  common  factors  were  present. 
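
For readers unfamiliar with the input to these factor analyses, the phi coefficient is the product-moment correlation between two dichotomous items, computed from their 2 x 2 table. The sketch below shows the standard formula; the function names and example vectors are hypothetical, and returning 0.0 for an item with no variance is a convention adopted here, not something stated in the report.

```python
import math


def phi_coefficient(x, y):
    """Phi coefficient between two equally long lists of 0/1 scores."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # convention used here when an item has no variance
    return (n11 * n00 - n10 * n01) / denom


def phi_matrix(data):
    """Matrix of phi coefficients for a persons-by-items 0/1 matrix."""
    columns = list(zip(*data))
    return [[phi_coefficient(ci, cj) for cj in columns] for ci in columns]
```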


All  of  the  items  loaded  on  a  single  factor 


The order-analysis results for datasets H1, M1, and L1 are
shown in Table 3. For H1, all three procedures correctly extracted
a single chain (dimension) of items. For M1, the three procedures were
not in agreement. While the single chain was correctly extracted
using c_t3, use of the other two procedures yielded multiple chains.
However, if the minimum consistency level of c_t1 had been lowered from
.90 to .86, then the correct single chain would have been extracted by
the c_t1 procedure. For dataset L1, composed of items which were
highly similar in terms of difficulty level, the c_t3 procedure was the
only procedure which extracted the single dimension. The other two
procedures failed to determine any item chains. Note that the overall
value of c_t1 was near zero, while for c_t3 it was fairly high.

Table 4 shows the order-analysis results for datasets H2, M2,
and L2. For H2, Krus & Bart's (1974) procedure could not accurately
extract the two factors. Items 6, 7, and 9 were incorrectly combined
in a chain with items 1, 2, 3, and 4. Reynolds' procedure extracted
the correct chains when either c_t1 or c_t3 was used. For M2, however,
only the c_t3 procedure extracted the two dimensions. The c_t1
procedure extracted one of the dimensions, but could not extract the
other. The results for Krus and Bart's procedure were chaotic in
terms of the factor structure of this dataset. For dataset L2,
as for L1, only the c_t3 procedure correctly extracted the two dimensions.
The other two procedures failed to combine any items into chains.

The  chain  extraction  results  for  M10,  shown  in  Table  5, 
illustrated  other  differences  among  the  three  procedures.  In  this 
dataset, there were no real common factors present among the items.
The c_t3 procedure extracted no chains at all. Krus and Bart's
procedure, however, yielded a large (8-item) chain, and the
c_t1 procedure yielded a number of small chains.


Table 5

Item Chain Extraction for Dataset M10

Mathematics Data

Factor analyses of the matrices of phi coefficients for the two
mathematics datasets are shown in Tables 6 and 7. For the UMATH
dataset, a two-factor solution is presented, although a scree test of
the eigenvalues did not clearly suggest the number of factors to
extract. Factor solutions were obtained for two through five factors,
and the two-factor solution best approximated simple structure. The
subtraction items (1 - 9) comprised one factor, while four of the
addition items (13 - 16) comprised the second factor. The four
second-factor items were all skills dealing with the addition of
two numbers that were opposite in sign.

However, when the data were adjusted for presumably erroneously
correct responses (AMATH), two clear factors of subtraction and
addition  emerged.  Only  two  eigenvalues  were  greater  than  one,  and 
a  very  clear  simple  structure  was  present.  The  correlation  between 
the  two  factors  was  .46. 

Order analyses of the mathematics data, shown in Table 8, gave
very different results from those of the factor analyses. For both
datasets, neither the Krus & Bart procedure nor the c_t1 procedure
yielded chains that showed any resemblance to the factors. The
c_t3 procedure also failed to reproduce the factor structure for
either dataset. For AMATH in particular, the c_t3 procedure found
one chain with fairly high overall consistency (c_t3 = .764).


Discussion 

It  quickly  became  clear  from  the  results  of  the  simulated  data 
that Krus & Bart's (1974) procedure did not perform very well in
reproducing the factor structures of the datasets. The c_t1 procedure
did not fare much better; it reproduced the factor structures only for
datasets H1 and H2. Basically, there are two reasons for the poor
results  from  these  two  procedures.  First,  when  a  factor  contains  two 

Table 6

Simple Common Factor Analysis of Phi
Coefficients for the Unadjusted Mathematics (UMATH) Dataset

                      Factor I   Factor II
Item   Mean   S.D.    loadings   loadings    Eigenvalues
----   ----   ----    --------   ---------   -----------
 1     .648   .480      .817      -.119         5.449
 2     .680   .468      .847      -.120         2.245
 3     .584   .495      .696      -.078         1.665
 4     .576   .496      .693      -.162         1.223
 5     .720   .451      .891      -.096         1.028
 6     .744   .438      .713       .082          .757
 7     .824   .382      .617       .068          .712
 8     .856   .352      .485      -.025          .545
 9     .704   .453      .635       .118          .456
10     .992   .089      .119       .037          .407
11     .912   .284      .368       .159          .371
12     .936   .246      .352       .060          .338
13     .920   .272      .038       .765          .252
14     .944   .231     -.011       .591          .238
15     .920   .272     -.035       .509          .165
16     .920   .272      .051       .684          .150

Note: Factors were rotated using the Oblimin method.


Table 7

Simple Common Factor Analysis of Phi
Coefficients for the Adjusted Mathematics (AMATH) Dataset

                      Factor I   Factor II
Item   Mean   S.D.    loadings   loadings    Eigenvalues
----   ----   ----    --------   ---------   -----------
 1     .600   .492      .834       .015         8.216
 2     .624   .486      .865       .007         2.881
 3     .536   .501      .771      -.022          .836
 4     .536   .501      .770      -.024          .682
 5     .638   .465      .937       .030          .608
 6     .664   .474      .895       .041          .501
 7     .696   .462      .920       .050          .396
 8     .792   .408      .679      -.068          .370
 9     .648   .480      .749       .068          .347
10     .960   .197      .144       .353          .285
11     .838   .317      .108       .745          .251
12     .904   .296      .069       .786          .225
13     .888   .317     -.086       .858          .172
14     .896   .306     -.008       .850          .099
15     .872   .335     -.094       .814          .087
16     .880   .326     -.014       .833          .045

Table 8

Item Chain Extraction for the Mathematics Datasets

Note: Cutoff values of c_t1 and c_t3 used were .90 and .70, respectively.

or more items with highly similar difficulty levels, all of these
items will frequently not appear on the same chain. Items that are
too close together in terms of difficulty will often fail to show
a clear dominance relation, indicated by a low value of McNemar's z.
Hence, by Krus & Bart's procedure, this absence of a relation will
imply that the items do not belong to the same dimension. Correspondingly,
the low z value also means that c_t1 between the items will also be
very low. Thus, items similar in difficulty often show very inconsistent
dominance relations.

The second problem with Krus & Bart's procedure and the c_t1 procedure
is also related to the distribution of item difficulty levels. Two
items which are independent can show a consistent dominance relation
which is due solely to difficulty differences between the two items.
For example, consider two items that are independent and have difficulty
levels of .30 and .90 computed from a sample of 100 persons. The
expected number of dominances of item 1 over item 2 is equal to
100 x p(failing item 1 and passing item 2) = (100)(.70)(.90) = 63.
Likewise, the expected number of dominances of item 2 over item 1
is equal to (100)(.30)(.10) = 3. In this case, z = 7.39 and c_t1 = .91.
This illustrates that items that are disparate in difficulty will tend to
show consistent dominance relations regardless of whether or not they
belong to the same factor.

The value of c_t3 for the above-mentioned example is 0. This
illustrates a desirable property of c_t3: the expected number
of chance dominances (for independent items) is taken into
consideration. The c_t3 procedure is also less prone to the first problem
described above, that items too similar in difficulty level tend not
to show a clear dominance relation.
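
The arithmetic of the two-item example can be checked with a short script. The pairwise formulas used here are inferred from the numbers quoted in the example and from the stated equivalence of the chance-adjusted index (c_t3) to Loevinger's homogeneity index; they are assumptions for illustration and are not taken from Cliff's (1977) definitions. The unadjusted index is c_t1, and all variable names are hypothetical.

```python
import math

n_persons = 100
p_pass_1, p_pass_2 = 0.30, 0.90  # difficulty levels (proportion passing)

# Expected dominance counts when the two items are independent.
n_12 = n_persons * (1 - p_pass_1) * p_pass_2  # fail item 1, pass item 2
n_21 = n_persons * (1 - p_pass_2) * p_pass_1  # fail item 2, pass item 1

# McNemar z for the preponderance of 1-over-2 dominances.
z = (n_12 - n_21) / math.sqrt(n_12 + n_21)

# Unadjusted consistency: net proportion of dominances in the majority direction.
c_t1 = (n_12 - n_21) / (n_12 + n_21)

# Chance-adjusted consistency (Loevinger-style): one minus the ratio of observed
# counter-dominances to those expected for independent items.
expected_counter = n_persons * (1 - p_pass_2) * p_pass_1
c_t3 = 1 - n_21 / expected_counter

print(round(z, 2), round(c_t1, 2), c_t3)
```

Because the "observed" counts in this example are themselves the values expected under independence, the chance-adjusted index is exactly zero even though z and the unadjusted index are both large.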

The c_t3 procedure yielded chains which correctly reflected the
factor structures for all seven simulated datasets. It was found to
be consistently superior to both the c_t1 and Krus & Bart procedures.
The better performance of c_t3 compared with c_t1 is in agreement with
results found by Cudeck (1980). However, for the mathematics data,
the c_t3 procedure did a poor job of reproducing the factorial
dimensionality. Two reasons are offered for this finding. First, the
mathematics datasets showed a fairly strong first factor as evidenced
by the magnitude of the first eigenvalues. The two-dimensional
simulated datasets showed no strong first factor. For the mathematics
data, c_t3 may have been unduly influenced by the first factor, which
could have distorted the chain-extraction process. A second reason
for the failure of c_t3 to reproduce the factors for the mathematics
data concerns the correlation between the factors. The factors for
the simulated datasets were all orthogonal, whereas for the
mathematics data the factors were substantially correlated (e.g., r = .46
for AMATH). In the case of correlated factors, the c_t3 procedure
may not be able to distinguish between items loading on different
factors.


Study  II 

An  attempt  was  made  to  develop  a  new  order-analysis  procedure 
which  alleviated  the  problems  of  current  procedures.  Study  I 
illustrated  three  major  shortcomings  of  current  order-analysis 
procedures  for  reproducing  factorial  dimensionality: 

1)  Items  from  the  same  factor  with  similar  difficulty  levels 
can  be  seen  as  being  inconsistent  (in  the  sense  of  showing 
about  as  many  dominances  as  counter-dominances)  and  are 
therefore  deemed  to  belong  to  different  dimensions. 

2)  Two  items  that  are  independent  can  show  a  consistent 
dominance  relation  which  is  due  solely  to  difficulty 
differences  between  the  items. 

3) Order analysis of a set of items with an oblique factor
structure will often not reproduce the factorial dimensions.

The new order-analysis procedure, termed ORDO, was designed
specifically to address the first two of these problems. Basically,
ORDO represents an amalgamation of Krus and Bart's (1974) procedure
and the Reynolds (1976) procedure using c^. Krus and Bart's approach
seemed to be a good place to start in developing a new procedure,
as it "truly" reflects the basic order-analytic principles of items
(and persons) forming simple orders. Reynolds' procedure, on the
other hand, deals with the consistency of an item set, which is
assumed to be an indicator of the orderability of the item set. In
this sense, Reynolds' approach might be termed an indirect
order-analysis procedure.

ORDO represents a radical departure from other order-analysis
procedures in that it extracts partial orders of items rather than
simple orders (see Coombs et al., 1970). The connectedness property
of simple orders creates the first problem with order-analysis procedures
mentioned above. Considering dimensions as partial orders
allows for two items to fall in the same dimension without there
necessarily being a dominance relation between them. This may seem
problematic, as the lack of a dominance relation between two items
also represents the primary evidence that those items are from
different dimensions. However, a pair of items from the same dimension
that do not show a dominance relation have another characteristic:
high proximity. The proximity measure used is the squared Euclidean
distance between the points representing the two items, which is
also equal to the total number of persons for which one of the two
items dominated the other. If two items are close together on the
same dimension, few persons will pass only one of them. This high
proximity characteristic is not evident for pairs of items which do
not measure the same dimension.
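
The equivalence between the two forms of the proximity measure can be checked directly. The sketch below assumes each item is represented by a vector of 0/1 scores over persons; the function names and the response vectors are hypothetical and serve only as an illustration.

```python
import numpy as np


def squared_distance(item_a, item_b):
    """Squared Euclidean distance between two 0/1 item vectors."""
    a = np.asarray(item_a, dtype=int)
    b = np.asarray(item_b, dtype=int)
    return int(np.sum((a - b) ** 2))


def persons_passing_exactly_one(item_a, item_b):
    """Number of persons who pass one of the two items but not the other."""
    a = np.asarray(item_a, dtype=int)
    b = np.asarray(item_b, dtype=int)
    return int(np.sum(a != b))


# Hypothetical responses of eight persons to three items (1 = pass).
item_1 = [1, 1, 0, 1, 0, 0, 1, 0]
item_2 = [1, 1, 0, 0, 0, 0, 1, 0]  # nearly the same pattern as item_1
item_3 = [0, 1, 1, 0, 1, 0, 0, 1]  # largely unrelated pattern

d_12 = squared_distance(item_1, item_2)
d_13 = squared_distance(item_1, item_3)
```

Because the scores are binary, each squared difference is 1 exactly when a person passes only one of the two items, so the two functions always agree. A small distance corresponds to what the text calls high proximity.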

The basic algorithm for ORDO proceeds as follows. Compute the
item dominance matrix and reorder the rows and columns in terms of
decreasing item difficulty level. Compute McNemar's z statistics
for each item pair, as well as chi-square tests for association. If
the values of z and chi-square are both significant, then conclude
that a true relation (beyond that attributable to difficulty differences)
exists between the two items. If either or both are not
significant, then conclude that a true relation is not present. Next,
use the relation information to extract a chain of items using Krus
and Bart's (1974) method. This forms what is termed a "skeleton"
chain of items. Items that have high proximity to one of the skeleton
chain members are then added to the chain. This process
results in each skeleton chain member and the items added to it being
considered as an equivalence class, where items between equivalence
classes should show consistent dominance relations, and items within
equivalence classes should not show consistent dominance relations.

The chain-extraction process is then repeated for items which are not
already members of a chain until all items are placed in a chain
(singleton chains are allowed). The number of extracted chains is
interpreted as the dimensionality of the dataset.
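
The first stage of the algorithm (dominance matrix, reordering, and the two significance tests) can be sketched as follows. This is a minimal illustration under stated assumptions, not the ORDO program itself: the critical values, the reading of "decreasing item difficulty level" as hardest item first, the use of the standard uncorrected McNemar and 2 x 2 chi-square formulas, and the name relation_matrix are all assumptions. Chain extraction by Krus and Bart's method and the proximity step are not shown.

```python
import numpy as np

Z_CRIT = 1.96      # assumed two-sided .05 critical value for McNemar's z
CHI2_CRIT = 3.841  # assumed .05 critical value for chi-square with 1 df


def relation_matrix(responses, z_crit=Z_CRIT, chi2_crit=CHI2_CRIT):
    """Flag item pairs for which both tests are significant.

    responses is a persons-by-items array of 0/1 scores. Returns the
    item ordering, the reordered dominance matrix, and a boolean
    matrix of "true relations" (hypothetical helper).
    """
    x = np.asarray(responses, dtype=int)
    n_persons, n_items = x.shape
    # Hardest item first: ascending proportion passing (assumed reading).
    order = np.argsort(x.mean(axis=0), kind="stable")
    x = x[:, order]
    # dominance[j, k] = number of persons who fail item j but pass item k.
    dominance = (1 - x).T @ x
    both_pass = x.T @ x
    both_fail = (1 - x).T @ (1 - x)
    related = np.zeros((n_items, n_items), dtype=bool)
    for j in range(n_items):
        for k in range(j + 1, n_items):
            b, c = dominance[j, k], dominance[k, j]
            a, d = both_pass[j, k], both_fail[j, k]
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if b + c == 0 or denom == 0:
                continue  # tests undefined; treat as no relation
            z = (b - c) / np.sqrt(b + c)                     # McNemar's z
            chi2 = n_persons * (a * d - b * c) ** 2 / denom  # association
            related[j, k] = related[k, j] = abs(z) > z_crit and chi2 > chi2_crit
    return order, dominance, related
```

Requiring both tests means that a pair of independent items differing only in difficulty can pass the McNemar test yet still fail the association test, and so is not treated as a true relation.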


Method  and  Results 

The simulated and mathematics datasets described in Study I were
order-analyzed using ORDO. Although the results for the simulated
data are not shown here, ORDO correctly reproduced the factors for
all seven datasets. The results for the mathematics data are shown in
Figures 1a and 1b. For the UMATH dataset, ORDO extracted four chains.
Two of the chains were equivalent to the factors found for the two-factor
solution given in Table 6. The four chains were labeled: subtraction,
addition of two negative numbers, addition of two numbers with opposite
signs, and addition of two positive numbers. For the AMATH dataset,
ORDO extracted two chains which were clearly the same as the two
factors of addition and subtraction. For both datasets, chains containing
addition items showed few equivalence classes, due to highly similar
means for those items.


[Figure 1a panels: Order I (Subtraction); Order II (Addition of two
negative numbers); Order III (Addition of numbers with opposite signs);
Order IV (Addition of two positive numbers)]

Figure 1a. Order analysis results for UMATH dataset using ORDO (brackets denote
equivalence classes, arrows denote dominances).

Figure 1b. Order analysis results for AMATH dataset using ORDO (brackets denote
equivalence classes, arrows denote dominances).


Discussion

The results of this study support the use of ORDO as the order-analysis
procedure to use in assessing the dimensionality of a test.
ORDO matched the c^ procedure in reproducing the factors present
in the simulated data, and it outperformed the c procedure in
determining the factor structure of the mathematics data. Apparently,
ORDO is less sensitive than the c procedure to oblique factor
structures and/or dominant first factors in a dataset.

The main motivation for extracting unidimensional subsets of
items concerns satisfying the unidimensionality requirement of
latent-trait models. Lord & Novick (1968) state that if performance on a
set of items has an underlying multivariate normal distribution and
a single common factor is present in a matrix of tetrachoric correlation
coefficients, then the latent space is unidimensional and
local independence holds. In this study, phi coefficients were used
rather than tetrachoric coefficients. There are two persistent
problems with tetrachoric correlation coefficients. First, when one item
dominates another item in a perfectly consistent manner (i.e., no
counterdominances), the tetrachoric correlation is equal to 1.0.
However, since in most cases the correlation coefficient is calculated
for sample data, one would typically be reluctant to accept 1.0 as
a population correlation estimate. Second, matrices of sample tetrachoric
coefficients will often be non-Gramian, in violation of basic assumptions
of the factor-analytic model. Neither of these problems occurs when
phi coefficients are used. While phi coefficients are influenced
by the relative difficulty levels of the items, Comrey (1973) reported
finding the influences of difficulty factors to be minor, and he
endorsed the use of phi rather than tetrachoric coefficients.
Hence, phi coefficients were deemed to be appropriate in this study.
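
Both points can be illustrated with a short sketch. The 2 x 2 cell counts and the 3 x 3 matrix below are hypothetical values, not data from this study, and the function names are assumptions. The phi formula is the standard one for a fourfold table, and the Gramian check simply tests for negative eigenvalues.

```python
import math

import numpy as np


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table.

    a = both items passed, b and c = only one item passed, d = both failed.
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def is_gramian(matrix, tol=1e-10):
    """True when a symmetric matrix has no negative eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(matrix, dtype=float))
    return bool(eigenvalues.min() >= -tol)


# Hypothetical table with no counterdominances (c = 0): phi stays
# well below 1 because the two items differ in difficulty.
phi_no_counter = phi_coefficient(30, 40, 0, 30)

# Hypothetical, mutually inconsistent "correlations" of the kind that
# pairwise estimation can produce; the matrix is not Gramian.
inconsistent = [[1.0, 0.9, -0.9],
                [0.9, 1.0, 0.9],
                [-0.9, 0.9, 1.0]]
gramian_flag = is_gramian(inconsistent)
```

For the assumed table, phi equals 3/7 (about .43), whereas, as noted above, the tetrachoric coefficient for a table with no counterdominances is 1.0.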

Order analysis avoids many of the problems involved in factor
analysis. Also, no distributional assumptions are required in the
order-analytic model. This study has shown that ORDO can yield
results that are highly similar to results found with factor analysis.
Order analysis may represent a very desirable alternative to factor
analysis in assessing the dimensionality of tests.

Certainly more research is necessary to determine the eventual
usefulness of order analysis in determining item sets which are
appropriate for item response theory.


Appendix 3

Factor Analysis Results for Dataset L1


Item    Mean    S.D.    Factor I
                        loadings

  1     .480    .500      .866
  2     .490    .500      .877
  3     .470    .500      .848
  4     .480    .500      .845
  5     .488    .500      .846
  6     .480    .500      .856
  7     .470    .500      .831
  8     .474    .500      .854
  9     .480    .500      .868
 10     .464    .499      .856

Appendix 4

Factor Analysis Results for Dataset H2


Item    Mean    S.D.    Factor I    Factor II
                        loadings    loadings

  1     .210    .408     -.049        .591
  2     .396    .490      .043        .809
  3     .592    .492     -.013        .808
  4     .808    .394     -.035        .565
  5     .242    .429      .639       -.006
  6     .290    .454      .711        .046
  7     .458    .499      .827        .011
  8     .576    .495      .832       -.057
  9     .716    .451      .736       -.027
 10     .824    .381      .588       -.075


Appendix 5

Factor Analysis Results for Dataset M2


Item    Mean    S.D.    Factor I    Factor II
                        loadings    loadings

  1     .226    .419      .083        .680
  2     .416    .493     -.002        .857
  3     .436    .496     -.002        .862
  4     .718    .450     -.031        .578
  5     .122    .328      .667       -.042
  6     .106    .308      .633       -.054
  7     .400    .490      .765       -.018
  8     .404    .491      .762       -.030
  9     .904    .295      .371        .073
 10     .904    .295      .372        .048


Appendix 6

Factor Analysis Results for Dataset L2


Item    Mean    S.D.    Factor I    Factor II
                        loadings    loadings

  1     .510    .500     -.010        .856
  2     .500    .500     -.018        .832
  3     .494    .500     -.013        .852
  4     .496    .500      .016        .838
  5     .498    .500      .839        .009
  6     .522    .500      .843       -.005
  7     .522    .500      .810       -.001
  8     .512    .500      .848       -.015
  9     .510    .500      .819       -.015
 10     .498    .500      .839       -.012